Abstract: The complementary problems of masking and amplifying channel state information in the Gel'fand-Pinsker channel have recently been solved by Merhav and Shamai, and Kim
et al., respectively. In this paper, we study a related source coding 
problem. Specifically, we consider the two-encoder source coding 
setting where one source is to be amplified, while the other source 
is to be masked. In general, there is a tension between these two 
objectives which is characterized by the amplification-masking 
tradeoff. In this paper, we give a single-letter description of this 
tradeoff. 

We apply this result, together with a recent theorem by 
Courtade and Weissman on multiterminal source coding, to solve 
a fundamental entropy characterization problem. 

I. Introduction 

The well-known source coding with side information problem has an achievable rate region given by

$$R_x \ge H(X|U), \qquad R_y \ge I(Y;U),$$

as originally shown by Ahlswede and Körner [1], and independently by Wyner [2]. In this setting, the side information encoder merely serves as a helper with the sole purpose of aiding in the recovery of $X^n$ at the decoder. However, for given rates $(R_x, R_y)$, there may be many different coding schemes which permit recovery of $X^n$ at the decoder. In some cases, it may be desirable to select a coding scheme that reveals very little information about the side information $Y^n$ to the decoder. We refer to this objective as masking the side information.

To motivate this setting, consider the following example. Suppose $X$ is an attribute of an online customer that an advertiser would like to specifically target (e.g., gender), and $Y$ is other detailed information about the same customer (e.g., credit history). Companies A and B separately have databases $X^n$ and $Y^n$ corresponding to $n$ different customers (the databases could be indexed by IP address, for example). The advertiser pays Companies A and B to learn as much about the database $X^n$ as possible. Now, suppose governing laws prohibit the database $Y^n$ from being revealed too extensively. In this case, the material given to the advertiser must be chosen so that at most a prescribed amount of information is revealed about $Y^n$.

In general, a masking constraint on $Y^n$ may render near-lossless reconstruction of $X^n$ impossible. This motivates the study of the amplification-masking tradeoff, that is, the tradeoff between amplifying (or revealing) information about $X^n$ while simultaneously masking the side information $Y^n$.

Similar problems have been previously considered in the information theory literature on secrecy and privacy. For example, Sankar et al. determine the utility-privacy tradeoff for the case of a single encoder in [3]. In their setting, the random variable $X$ is a vector with a given set of coordinates that should be masked and another set that should be revealed (up to a prescribed distortion). In this context, our study of the amplification-masking tradeoff is a distributed version of [3], in which utility is measured by the information revealed about the database $X^n$. The problem we consider is distinct from those typically studied in the information-theoretic secrecy literature, in that the masking (i.e., equivocation) constraint corresponds to the intended decoder, rather than an eavesdropper.

We remark that the present paper is inspired in part by the recent, complementary works [4] and [5], which respectively study amplification and masking of channel state information. We borrow our terminology from those works.

This paper is organized as follows. Section II formally defines the problems considered and delivers our main results. The corresponding proofs are given in Section III. Final remarks and directions for future work are discussed in Section IV.

II. Problem Statement and Results 

Throughout this paper we adopt notational conventions that are standard in the literature. Specifically, random variables are denoted by capital letters (e.g., $X$) and their corresponding alphabets are denoted by corresponding calligraphic letters (e.g., $\mathcal{X}$). We abbreviate a sequence $(X_1,\dots,X_n)$ of $n$ random variables by $X^n$, and we let $\delta(\epsilon)$ represent a quantity satisfying $\lim_{\epsilon\to 0}\delta(\epsilon) = 0$. Other notation will be introduced where necessary.

For a joint distribution $p(x,y)$ on finite alphabets $\mathcal{X}\times\mathcal{Y}$, consider the source coding setting where separate Encoders 1 and 2 have access to the sequences $X^n$ and $Y^n$, respectively. We make the standard assumption that the sequences $(X^n,Y^n)$ are drawn i.i.d. according to $p(x,y)$ (i.e., $(X^n,Y^n) \sim \prod_{i=1}^n p(x_i,y_i)$), and $n$ can be taken arbitrarily large.

The first of the following three subsections characterizes the amplification-masking tradeoff. This result is applied to solve a fundamental entropy characterization problem in the second subsection. The final subsection comments on the connection between information amplification and list decoding. Proofs of the main results are postponed until Section III.

A. The Amplification-Masking Tradeoff 

Formally, a $(2^{nR_x}, 2^{nR_y}, n)$ code is defined by its encoding functions

$$f_x : \mathcal{X}^n \to \{1,\dots,2^{nR_x}\} \quad\text{and}\quad f_y : \mathcal{Y}^n \to \{1,\dots,2^{nR_y}\}.$$

A rate-amplification-masking tuple $(R_x, R_y, \Delta_A, \Delta_M)$ is achievable if, for any $\epsilon > 0$, there exists a $(2^{nR_x}, 2^{nR_y}, n)$ code satisfying the amplification criterion:

$$\Delta_A \le \frac{1}{n} I(X^n; f_x(X^n), f_y(Y^n)) + \epsilon, \tag{1}$$

and the masking criterion:

$$\Delta_M \ge \frac{1}{n} I(Y^n; f_x(X^n), f_y(Y^n)) - \epsilon. \tag{2}$$

Thus, we see that the amplification-masking problem is an entropy characterization problem similar to that considered in [6, Chapter 15].

Definition 1: The achievable amplification-masking region $\mathcal{R}_{AM}$ is the closure of the set of all achievable rate-amplification-masking tuples $(R_x, R_y, \Delta_A, \Delta_M)$.

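Theorem 1: $\mathcal{R}_{AM}$ consists of the rate-amplification-masking tuples $(R_x, R_y, \Delta_A, \Delta_M)$ satisfying

$$\begin{aligned}
R_x &\ge \Delta_A - I(X;U) \\
R_y &\ge I(Y;U) \\
\Delta_M &\ge \max\{\, I(Y;U,X) + \Delta_A - H(X),\; I(Y;U) \,\} \\
\Delta_A &\le H(X)
\end{aligned} \tag{3}$$

for some joint distribution $p(x,y,u) = p(x,y)p(u|y)$, where $|\mathcal{U}| \le |\mathcal{Y}| + 1$.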
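
As a sanity check on Theorem 1, the bounds in (3) can be evaluated at the two extreme choices of the auxiliary variable. The following worked evaluation is an illustration added here rather than part of the theorem statement; it uses only the identity $I(X;Y) = H(X) - H(X|Y)$ and the Markov structure $p(x,y)p(u|y)$.

```latex
\begin{align*}
&\text{$U$ constant:} & I(X;U) &= I(Y;U) = 0, \qquad I(Y;U,X) = I(X;Y), \\
& & R_x &\ge \Delta_A, \quad R_y \ge 0, \quad
      \Delta_M \ge \max\{\Delta_A - H(X|Y),\, 0\}; \\[4pt]
&\text{$U = Y$:} & I(X;U) &= I(X;Y), \qquad I(Y;U) = I(Y;U,X) = H(Y), \\
& & R_x &\ge \Delta_A - I(X;Y), \quad R_y \ge H(Y), \quad
      \Delta_M \ge H(Y).
\end{align*}
```

In the first case Encoder 2 is silent, and the lower bound on $\Delta_M$ becomes positive only once $\Delta_A$ exceeds $H(X|Y)$. In the second case the side information is fully described, which lowers the bound on $R_x$ at the cost of $\Delta_M \ge H(Y)$; the maximum in (3) reduces to $H(Y)$ because $\Delta_A \le H(X)$.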

Observe that $\mathcal{R}_{AM}$ characterizes the entire tradeoff between amplifying $X^n$ and masking $Y^n$. We remark that maximum amplification $\Delta_A = H(X)$ does not necessarily imply that $X^n$ can be recovered near-losslessly at the decoder. However, if an application demands near-lossless reproduction of the sequence $X^n$, Theorem 1 can be strengthened to include this case. To this end, define a rate-masking triple $(R_x, R_y, \Delta_M)$ to be achievable if, for any $\epsilon > 0$, there exists a $(2^{nR_x}, 2^{nR_y}, n)$ code satisfying the masking criterion (2), and a decoding function

$$\hat{X}^n : \{1,2,\dots,2^{nR_x}\} \times \{1,2,\dots,2^{nR_y}\} \to \mathcal{X}^n$$

which satisfies the decoding-error criterion

$$\Pr\left[ X^n \ne \hat{X}^n(f_x(X^n), f_y(Y^n)) \right] \le \epsilon.$$

Definition 2: The achievable rate-masking region $\mathcal{R}_M$ is the closure of the set of all achievable rate-masking triples $(R_x, R_y, \Delta_M)$.

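Corollary 1: $\mathcal{R}_M$ consists of the rate-masking triples $(R_x, R_y, \Delta_M)$ satisfying

$$\begin{aligned}
R_x &\ge H(X|U) \\
R_y &\ge I(Y;U) \\
\Delta_M &\ge I(Y;X,U)
\end{aligned}$$

for some joint distribution $p(x,y,u) = p(x,y)p(u|y)$, where $|\mathcal{U}| \le |\mathcal{Y}| + 1$.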



B. An Entropy Characterization Result 

As we previously noted, the amplification-masking tradeoff solves a multi-letter entropy characterization problem by reducing it to single-letter form. The reader is directed to [6] for an introduction to entropy characterization problems. Here, we apply our results to yield a fundamental characterization of the information revealed about $X^n$ and $Y^n$, respectively, by arbitrary encoding functions $f_x$ and $f_y$ (of rates $R_x, R_y$).

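Definition 3: Define the region $\mathcal{R}^\star(R_x, R_y)$ as follows. The pair $(\Delta_X, \Delta_Y) \in \mathcal{R}^\star(R_x, R_y)$ if and only if, for any $\epsilon > 0$, there exists a $(2^{nR_x}, 2^{nR_y}, n)$ code satisfying

$$\left| \Delta_X - \frac{1}{n} I(X^n; f_x(X^n), f_y(Y^n)) \right| \le \epsilon, \quad\text{and}\quad \left| \Delta_Y - \frac{1}{n} I(Y^n; f_x(X^n), f_y(Y^n)) \right| \le \epsilon.$$

Let $\overline{\mathcal{R}^\star}(R_x, R_y)$ be the closure of $\mathcal{R}^\star(R_x, R_y)$.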

Ultimately we obtain a single-letter description of $\overline{\mathcal{R}^\star}(R_x, R_y)$. However, in order to do so, we require some notation. To this end, let

$$\mathcal{R}_{AM}(R_x, R_y) = \{ (\Delta_X, \Delta_Y) : (R_x, R_y, \Delta_X, \Delta_Y) \in \mathcal{R}_{AM} \}.$$

Symmetrically, let $\mathcal{R}_{MA}$ be the region where $X^n$ is subject to masking $\Delta_X$ and $Y^n$ is subject to amplification $\Delta_Y$. Let

$$\mathcal{R}_{MA}(R_x, R_y) = \{ (\Delta_X, \Delta_Y) : (R_x, R_y, \Delta_X, \Delta_Y) \in \mathcal{R}_{MA} \}.$$

Finally, let $\mathcal{R}_{AA}(R_x, R_y)$ consist of all pairs $(\Delta_X, \Delta_Y)$ satisfying

$$\begin{aligned}
R_x &\ge I(U_x; X | U_y, Q) \\
R_y &\ge I(U_y; Y | U_x, Q) \\
R_x + R_y &\ge I(U_x, U_y; X, Y | Q) \\
\Delta_X &\le I(X; U_x, U_y | Q) \\
\Delta_Y &\le I(Y; U_x, U_y | Q)
\end{aligned}$$

for some joint distribution of the form

$$p(x, y, u_x, u_y, q) = p(x,y)\, p(u_x | x, q)\, p(u_y | y, q)\, p(q),$$

where $|\mathcal{U}_x| \le |\mathcal{X}|$, $|\mathcal{U}_y| \le |\mathcal{Y}|$, and $|\mathcal{Q}| \le 5$.

Theorem 2: The region $\overline{\mathcal{R}^\star}(R_x, R_y)$ has a single-letter characterization given by

$$\overline{\mathcal{R}^\star}(R_x, R_y) = \mathcal{R}_{AM}(R_x, R_y) \cap \mathcal{R}_{MA}(R_x, R_y) \cap \mathcal{R}_{AA}(R_x, R_y).$$

Moreover, restriction of the encoding functions to vector quantization and/or random binning is sufficient to achieve any point in $\overline{\mathcal{R}^\star}(R_x, R_y)$.

The second statement of Theorem 2 is notable since it states that relatively simple encoding functions (i.e., vector quantization and/or binning) can asymptotically reveal the same amount of information about $X^n$ and $Y^n$, respectively, as encoding functions that are only restricted in rate. In contrast, this is not true for the setting of three or more sources, as the modulo-sum problem studied by Körner and Marton [7] provides a counterexample where the Berger-Tung achievability scheme [8] is not optimal. Thus, obtaining a characterization like Theorem 2 for three or more sources represents a formidable challenge.

Fig. 1. The region $\overline{\mathcal{R}^\star}(R_x, R_y)$ for the joint distribution $P_{X,Y}$ given by (4) and three different pairs of rates. Rate pairs $(R_x, R_y)$ equal to (0.1, 0.7), (0.4, 0.4), and (0.5, 0.6) define the convex regions bounded by the black, blue, and red curves, respectively. (Horizontal axis: $\Delta_X$.)

We remark that the points in $\overline{\mathcal{R}^\star}(R_x, R_y)$ with $\Delta_X = H(X)$ and/or $\Delta_Y = H(Y)$ also capture the more stringent constraint(s) of near-lossless reproduction of $X^n$ and/or $Y^n$, respectively. This is a consequence of Corollary 1.

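To give a concrete example of $\overline{\mathcal{R}^\star}(R_x, R_y)$, consider the following joint distribution:

$$\begin{array}{c|cc}
P_{X,Y}(x,y) & x = 0 & x = 1 \\ \hline
y = 0 & 1/3 & 0 \\
y = 1 & 1/6 & 1/2
\end{array} \tag{4}$$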



By performing a brute-force search over the auxiliary random variables defining $\overline{\mathcal{R}^\star}(R_x, R_y)$ for the distribution $P_{X,Y}$, we have obtained numerical approximations of $\overline{\mathcal{R}^\star}(\cdot,\cdot)$ for several different pairs of $(R_x, R_y)$. The results are given in Figure 1.
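
To make the idea of such a search concrete, the following is a minimal sketch, not the authors' procedure. It assumes the example distribution has masses 1/3, 0, 1/6, 1/2 on $(x,y) = (0,0), (1,0), (0,1), (1,1)$, restricts the auxiliary variable to a binary $U$ on a coarse grid (Theorem 1 allows $|\mathcal{U}| \le |\mathcal{Y}| + 1$), and explores only the slice $\mathcal{R}_{AM}(R_x, R_y)$, not the full intersection in Theorem 2. The names `am_points`, `marginal`, and `H` are hypothetical helpers introduced for illustration.

```python
import itertools
import math

# Assumed example distribution; keys are (x, y).
P_XY = {(0, 0): 1 / 3, (1, 0): 0.0, (0, 1): 1 / 6, (1, 1): 1 / 2}


def H(pmf):
    """Shannon entropy in bits of a pmf stored as a dict."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def marginal(pmf, idx):
    """Marginalize a dict pmf onto the coordinates listed in idx."""
    out = {}
    for key, p in pmf.items():
        k = tuple(key[i] for i in idx)
        out[k] = out.get(k, 0.0) + p
    return out


def am_points(r_x, r_y, steps=20):
    """Grid over binary test channels p(u|y); for each channel with
    I(Y;U) <= r_y, record the largest amplification allowed by (3)
    and the smallest masking level compatible with it."""
    h_x = H(marginal(P_XY, (0,)))
    h_y = H(marginal(P_XY, (1,)))
    grid = [k / steps for k in range(steps + 1)]
    points = []
    for a, b in itertools.product(grid, repeat=2):
        q = {0: a, 1: b}  # q[y] = p(u = 1 | y)
        p_xyu = {
            (x, y, u): p * (q[y] if u else 1 - q[y])
            for (x, y), p in P_XY.items()
            for u in (0, 1)
        }
        h_u = H(marginal(p_xyu, (2,)))
        h_xu = H(marginal(p_xyu, (0, 2)))
        i_xu = h_x + h_u - h_xu                        # I(X;U)
        i_yu = h_y + h_u - H(marginal(p_xyu, (1, 2)))  # I(Y;U)
        i_y_xu = h_y + h_xu - H(p_xyu)                 # I(Y;X,U)
        if i_yu > r_y + 1e-12:
            continue  # rate constraint on Encoder 2 violated
        delta_a = min(h_x, r_x + i_xu)
        delta_m = max(i_y_xu + delta_a - h_x, i_yu)
        points.append((delta_a, delta_m))
    return points


points = am_points(0.4, 0.4)
print(len(points), max(da for da, _ in points))
```

Plotting the returned pairs gives a rough picture of how much masking must be given up as amplification grows for a fixed rate pair; a finer grid, a ternary $U$, and the symmetric $\mathcal{R}_{MA}$ and $\mathcal{R}_{AA}$ constraints would be needed before comparing against Figure 1.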

C. Connection to List Decoding 

We briefly comment on the connection between an amplification constraint and list decoding. As discussed in detail in [4], the amplification criterion (1) is essentially equivalent to the requirement for a list decoder

$$L_n : \{1,\dots,2^{nR_x}\} \times \{1,\dots,2^{nR_y}\} \to 2^{\mathcal{X}^n}$$

with list size and probability of error respectively satisfying

$$\log |L_n| \le n(H(X) - \Delta_A + \epsilon), \quad\text{and}\quad \Pr\left[ X^n \notin L_n(f_x(X^n), f_y(Y^n)) \right] \le \epsilon.$$

Thus maximizing the amplification of $X^n$ subject to given rate and masking constraints can be thought of as characterizing the best list decoder in that setting.
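
The list-size condition is easy to evaluate numerically. The short sketch below assumes logarithms are taken base 2 (consistent with message sets of size $2^{nR}$); the function name `list_size_exponent_bits` is a hypothetical helper, not notation from the paper.

```python
def list_size_exponent_bits(n, h_x, delta_a, eps):
    """Upper bound n(H(X) - Delta_A + eps) on log2 |L_n|, in bits.

    n: blocklength; h_x: H(X) in bits; delta_a: amplification level;
    eps: slack parameter from the amplification criterion.
    """
    if not 0.0 <= delta_a <= h_x:
        raise ValueError("amplification must lie in [0, H(X)]")
    if n <= 0 or eps < 0:
        raise ValueError("need n > 0 and eps >= 0")
    return n * (h_x - delta_a + eps)


# Example: a source with H(X) = 1 bit, blocklength 100, amplification 0.6.
bound = list_size_exponent_bits(n=100, h_x=1.0, delta_a=0.6, eps=0.01)
print(f"log2 |L_n| is at most about {bound:.1f} bits")
```

At $\Delta_A = 0$ the bound is vacuous up to the slack (the list may contain essentially all typical sequences), while at $\Delta_A = H(X)$ the list has sub-exponential size up to the factor $2^{n\epsilon}$.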



III. Proofs of Main Results 

Proof of Theorem 1: Converse Part: Suppose $(R_x, R_y, \Delta_A, \Delta_M)$ is achievable. For convenience, define $F_x = f_x(X^n)$, $F_y = f_y(Y^n)$, and $U_i = (F_y, Y^{i-1})$.

First, note that $\Delta_A \le H(X)$ is trivially satisfied. Next, the constraint on $R_x$ is given by:

$$\begin{aligned}
nR_x \ge H(F_x) &\ge H(F_x | F_y) \\
&= \sum_{i=1}^n H(X_i | F_y, X^{i-1}) - H(X^n | F_x, F_y) \\
&\ge \sum_{i=1}^n H(X_i | F_y, Y^{i-1}, X^{i-1}) - H(X^n | F_x, F_y) \\
&= I(X^n; F_x, F_y) - \sum_{i=1}^n I(X_i; U_i) && (5) \\
&\ge n(\Delta_A - \epsilon) - \sum_{i=1}^n I(X_i; U_i). && (6)
\end{aligned}$$

Equality (5) follows since $X_i \leftrightarrow (F_y, Y^{i-1}) \leftrightarrow X^{i-1}$ form a Markov chain, and inequality (6) follows since amplification $\Delta_A$ is achievable.
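
For readers who want the intermediate steps behind (5), the following expansion is a sketch added for clarity; it relies only on the Markov chain just stated and on the i.i.d. assumption on the source.

```latex
\begin{align*}
\sum_{i=1}^n H(X_i \mid F_y, Y^{i-1}, X^{i-1})
  &= \sum_{i=1}^n H(X_i \mid U_i)
     && \text{(Markov chain $X_i \leftrightarrow U_i \leftrightarrow X^{i-1}$)} \\
  &= \sum_{i=1}^n \bigl[ H(X_i) - I(X_i; U_i) \bigr] \\
  &= H(X^n) - \sum_{i=1}^n I(X_i; U_i)
     && \text{($X^n$ is i.i.d.)}
\end{align*}
% Subtracting H(X^n | F_x, F_y) from both sides and using
% H(X^n) - H(X^n | F_x, F_y) = I(X^n; F_x, F_y) gives the right-hand side of (5).
```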

The constraint on $R_y$ is trivial:

$$nR_y \ge H(F_y) \ge I(F_y; Y^n) = \sum_{i=1}^n I(Y_i; F_y | Y^{i-1}) = \sum_{i=1}^n I(Y_i; F_y, Y^{i-1}) = \sum_{i=1}^n I(Y_i; U_i).$$

Similarly, we obtain the first lower bound on $\Delta_M$:

$$n(\Delta_M + \epsilon) \ge I(Y^n; F_x, F_y) \ge I(Y^n; F_y) = \sum_{i=1}^n I(Y_i; U_i).$$

The second lower bound on $\Delta_M$ requires slightly more work, and can be derived as follows:

$$\begin{aligned}
n(\Delta_M + \epsilon) &\ge I(Y^n; F_x, F_y) \\
&= I(Y^n; X^n, F_y) + I(X^n; F_x, F_y) - I(X^n; F_x, Y^n) \\
&\ge I(Y^n; X^n, F_y) + n\Delta_A - I(X^n; F_x, Y^n) - n\epsilon && (7) \\
&\ge \sum_{i=1}^n I(Y_i; X^n, F_y | Y^{i-1}) + n\Delta_A - H(X^n) - n\epsilon \\
&\ge \sum_{i=1}^n \bigl[ I(Y_i; X_i, U_i) + \Delta_A - H(X_i) - \epsilon \bigr],
\end{aligned}$$

where (7) follows since amplification $\Delta_A$ is achievable.

Observing that the Markov condition $U_i \leftrightarrow Y_i \leftrightarrow X_i$ is satisfied for each $i$, a standard timesharing argument proves the existence of a random variable $U$ such that $U \leftrightarrow Y \leftrightarrow X$ forms a Markov chain and (3) is satisfied.

Direct Part: Fix $p(u|y)$ and suppose $(R_x, R_y, \Delta_A, \Delta_M)$ satisfy (3) with strict inequality. Next, fix $\epsilon > 0$ sufficiently small so that it is less than the minimum slack in said inequalities, and set $R = I(Y;U) + \epsilon$. Our achievability scheme uses a standard random coding argument which we sketch below.



Codebook generation. Randomly and independently, bin the typical $x^n$'s uniformly into $2^{n(\Delta_A - I(X;U) + \epsilon)}$ bins. Let $b(x^n)$ be the index of the bin which contains $x^n$. For $l \in \{1,\dots,2^{nR}\}$, randomly and independently generate $u^n(l)$, each according to $\prod_{i=1}^n p_U(u_i)$.

Encoding. Encoder 1, upon observing the sequence $X^n$, sends the corresponding bin index $b(X^n)$ to the decoder. If $X^n$ is not typical, an error is declared. Encoder 2, upon observing the sequence $Y^n$, finds an $L \in \{1,\dots,2^{nR}\}$ such that $(Y^n, U^n(L))$ are jointly typical, and sends the unique index $L$ to the decoder. If more than one such $L$ exists, ties are broken arbitrarily. If no such $L$ exists, then an error is declared.
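
The binning step at Encoder 1 can be pictured with a toy example. The sketch below is purely illustrative: it bins all binary sequences of a short length instead of only the typical ones, and the names `all_binary_sequences`, `build_binning`, and `encode` are hypothetical.

```python
import itertools
import math
import random
from collections import Counter


def all_binary_sequences(n):
    """All length-n binary tuples (a stand-in for the typical set)."""
    return list(itertools.product((0, 1), repeat=n))


def build_binning(sequences, n, rate_bits, seed=0):
    """Assign each sequence independently and uniformly to one of
    ceil(2^(n * rate_bits)) bins; returns (bin map, number of bins)."""
    rng = random.Random(seed)
    num_bins = max(1, math.ceil(2 ** (n * rate_bits)))
    bins = {seq: rng.randrange(num_bins) for seq in sequences}
    return bins, num_bins


def encode(bins, x_seq):
    """Encoder 1 sends only the bin index b(x^n) of its observation."""
    return bins[tuple(x_seq)]


n = 8
bins, num_bins = build_binning(all_binary_sequences(n), n, rate_bits=0.5)
occupancy = Counter(bins.values())
print(num_bins, sum(occupancy.values()) / num_bins)  # bins, mean bin size
```

With rate 0.5 and $n = 8$ there are 16 bins holding 16 sequences on average; the decoder learns the bin index but not which sequence inside the bin was observed, which is the residual uncertainty the amplification analysis must bound.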

This coding scheme clearly satisfies the given rates. Further, each encoder errs with arbitrarily small probability as $n \to \infty$. Hence, we only need to check that the amplification and masking constraints are satisfied. To this end, let $\mathcal{C}$ be the random codebook. We first check that the amplification and masking constraints are separately satisfied when averaged over random codebooks $\mathcal{C}$.

To see that the (averaged) amplification constraint is satisfied, consider the following:

$$\begin{aligned}
I(X^n; F_x, F_y | \mathcal{C}) &= H(X^n | \mathcal{C}) - H(X^n | b(X^n), L, \mathcal{C}) \\
&\ge nH(X) - n(H(X) - \Delta_A + \delta(\epsilon)) && (8) \\
&= n(\Delta_A - \delta(\epsilon)),
\end{aligned}$$

where (8) follows since $X^n$ is independent of $\mathcal{C}$ and, averaged over codebooks, there are at most $2^{n(H(X) - \Delta_A + \delta(\epsilon))}$ sequences $x^n$ in bin $b(X^n)$ which are typical with $U^n(L)$, where $L \in \{1,\dots,2^{nR}\}$. The details are given in the Appendix.

We now turn our attention to the masking criterion. First note the following inequality:

$$\begin{aligned}
I(Y^n; F_x, F_y | \mathcal{C}) &= I(Y^n; L | \mathcal{C}) + I(Y^n; b(X^n) | L, \mathcal{C}) \\
&\le I(Y^n; L | \mathcal{C}) + H(b(X^n) | \mathcal{C}) - H(b(X^n) | Y^n, \mathcal{C}) \\
&= I(Y^n; L | \mathcal{C}) + I(X^n; Y^n) - H(X^n) + H(b(X^n) | \mathcal{C}) \\
&\qquad - H(b(X^n) | Y^n, \mathcal{C}) + H(X^n | Y^n) \\
&\le I(Y^n; L | \mathcal{C}) + I(X^n; Y^n) - H(X^n) + H(b(X^n) | \mathcal{C}) \\
&\qquad - I(b(X^n); X^n | Y^n, \mathcal{C}) + H(X^n | Y^n) \\
&= I(Y^n; L | \mathcal{C}) + I(X^n; Y^n) - H(X^n) + H(b(X^n) | \mathcal{C}) \\
&\qquad + H(X^n | Y^n, b(X^n), \mathcal{C}). && (9)
\end{aligned}$$

Two of the terms in (9) can be bounded as follows. First, since $L \in \{1,\dots,2^{nR}\}$, we have

$$I(Y^n; L | \mathcal{C}) \le nR = n(I(Y;U) + \epsilon).$$

Second, there are $2^{n(\Delta_A - I(X;U) + \epsilon)}$ bins at Encoder 1 by construction, and hence $H(b(X^n) | \mathcal{C}) \le n(\Delta_A - I(X;U) + \epsilon)$. Therefore, substituting into (9) and simplifying, we have:

$$I(Y^n; F_x, F_y | \mathcal{C}) \le n(I(Y;U,X) + \Delta_A - H(X)) + H(X^n | Y^n, b(X^n), \mathcal{C}) + 2n\epsilon. \tag{10}$$
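
The simplification leading to (10) rests on a single-letter identity that holds whenever $U \leftrightarrow Y \leftrightarrow X$ is a Markov chain. The derivation below is a sketch added here to make that step explicit.

```latex
\begin{align*}
I(Y;U) + I(X;Y) - I(X;U)
  &= I(Y;U) + \bigl[H(X) - H(X|Y)\bigr] - \bigl[H(X) - H(X|U)\bigr] \\
  &= I(Y;U) + H(X|U) - H(X|Y) \\
  &= I(Y;U) + H(X|U) - H(X|U,Y)
     && \text{(Markov chain $U \leftrightarrow Y \leftrightarrow X$)} \\
  &= I(Y;U) + I(X;Y|U) \\
  &= I(Y;U,X).
\end{align*}
% Multiplying by n, adding n(\Delta_A - H(X)) and the two \epsilon terms
% from the bounds on I(Y^n; L | C) and H(b(X^n) | C) yields (10).
```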



We now consider three separate cases. First, assume $\Delta_A \le I(U;X)$. Then,

$$I(Y;X,U) + \Delta_A - H(X) \le I(Y;X,U) - H(X|U) = I(Y;U) - H(X|Y),$$

and (10) becomes

$$\begin{aligned}
I(Y^n; F_x, F_y | \mathcal{C}) &\le nI(Y;U) - I(X^n; b(X^n) | Y^n, \mathcal{C}) + 2n\epsilon \\
&\le nI(Y;U) + 2n\epsilon.
\end{aligned}$$

Next, suppose that $\Delta_A \ge I(X;U) + H(X|Y)$. In this case, there are more than $2^{n(H(X|Y) + \epsilon)}$ bins in which the $X^n$ sequences are distributed. Hence, knowing $Y^n$ and $b(X^n)$ is sufficient to determine $X^n$ with high probability (i.e., we have a Slepian-Wolf binning at Encoder 1). Therefore, $H(X^n | Y^n, b(X^n), \mathcal{C}) \le n\epsilon$, and (10) becomes

$$I(Y^n; F_x, F_y | \mathcal{C}) \le n(I(Y;X,U) + \Delta_A - H(X)) + 3n\epsilon.$$

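Finally, suppose $\Delta_A = I(X;U) + \theta H(X|Y)$ for some $\theta \in [0,1]$. In this case, we can timeshare between a code $\mathcal{C}_1$ designed for amplification $I(X;U)$ with probability $1 - \theta$, and a code $\mathcal{C}_2$ designed for amplification $\Delta'_A = I(X;U) + H(X|Y)$ with probability $\theta$, to obtain a code $\mathcal{C}$ with the same average rates and averaged amplification

$$\begin{aligned}
I(X^n; F_x, F_y | \mathcal{C}) &= (1-\theta)\, I(X^n; F_x, F_y | \mathcal{C}_1) + \theta\, I(X^n; F_x, F_y | \mathcal{C}_2) \\
&\ge n(I(X;U) + \theta H(X|Y) - \delta(\epsilon)) = n(\Delta_A - \delta(\epsilon)).
\end{aligned}$$

Then, applying the inequalities obtained in the previous two cases, we obtain:

$$\begin{aligned}
I(Y^n; F_x, F_y | \mathcal{C}) &= (1-\theta)\, I(Y^n; F_x, F_y | \mathcal{C}_1) + \theta\, I(Y^n; F_x, F_y | \mathcal{C}_2) \\
&\le (1-\theta)\, n I(Y;U) + \theta\, n (I(Y;X,U) + \Delta'_A - H(X)) + 3n\epsilon \\
&= nI(Y;U) + 3n\epsilon,
\end{aligned}$$

where the last equality holds because $I(Y;X,U) + \Delta'_A - H(X) = I(Y;U)$ for $\Delta'_A = I(X;U) + H(X|Y)$.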

Combining these three cases proves that

$$\frac{1}{n} I(Y^n; F_x, F_y | \mathcal{C}) \le \max\{ I(Y;U,X) + \Delta_A - H(X),\; I(Y;U) \} + 3\epsilon \le \Delta_M + 3\epsilon.$$

To show that there exists a code which satisfies the amplification and masking constraints simultaneously, we construct a super-code $\bar{\mathcal{C}}$ of blocklength $Nn$ by concatenating $N$ randomly, independently chosen codes of length $n$ (each constructed as described above). By the weak law of large numbers and independence of the concatenated coded blocks,

$$\Pr\left( \left\{ \bar{c} : \frac{1}{Nn} I(X^{Nn}; F_x, F_y | \bar{\mathcal{C}} = \bar{c}) \ge \Delta_A - \delta(\epsilon) \right\} \right) \ge 3/4,$$

$$\Pr\left( \left\{ \bar{c} : \frac{1}{Nn} I(Y^{Nn}; F_x, F_y | \bar{\mathcal{C}} = \bar{c}) \le \Delta_M + \delta(\epsilon) \right\} \right) \ge 3/4$$

for $N$ and $n$ sufficiently large. Since each event has probability at least 3/4, both occur together with probability at least 1/2. Thus, there must exist one super-code which simultaneously satisfies both desired constraints. This completes the proof that $(R_x, R_y, \Delta_A, \Delta_M)$ is achievable. Finally, we invoke the Support Lemma [6] to see that $|\mathcal{Y}| - 1$ letters are sufficient to preserve $p(y)$. In addition, we require two more letters to preserve the values of $H(X|U)$ and $I(Y;U|X)$. ■

Proof of Corollary 1: By setting $\Delta_A = H(X)$, [1, Theorem 2] implies that $X^n$ can be reproduced near-losslessly. A simplified version of the argument in the direct part of the proof of Theorem 1 shows that the masking criterion will be satisfied for the standard coding scheme. The converse of Theorem 1 continues to apply. ■

Proof of Theorem 2: First, we remark that the strengthened version of [9, Theorem 6] states that $\mathcal{R}_{AA}(R_x, R_y)$ is the closure of pairs $(\Delta_X, \Delta_Y)$ such that there exists a $(2^{nR_x}, 2^{nR_y}, n)$ code satisfying

$$\Delta_X \le \frac{1}{n} I(X^n; f_x(X^n), f_y(Y^n)) + \epsilon, \qquad \Delta_Y \le \frac{1}{n} I(Y^n; f_x(X^n), f_y(Y^n)) + \epsilon$$

for any $\epsilon > 0$.

Suppose $(\Delta_X, \Delta_Y) \in \mathcal{R}^\star(R_x, R_y)$. By definition of $\mathcal{R}^\star(R_x, R_y)$, Theorem 1, and the above statement, $(\Delta_X, \Delta_Y)$ also lies in each of the sets $\mathcal{R}_{AM}(R_x, R_y)$, $\mathcal{R}_{MA}(R_x, R_y)$, and $\mathcal{R}_{AA}(R_x, R_y)$. Since each of these sets is closed by definition, we must have

$$\overline{\mathcal{R}^\star}(R_x, R_y) \subseteq \mathcal{R}_{AM}(R_x, R_y) \cap \mathcal{R}_{MA}(R_x, R_y) \cap \mathcal{R}_{AA}(R_x, R_y).$$

Since each point in the sets $\mathcal{R}_{AM}(R_x, R_y)$, $\mathcal{R}_{MA}(R_x, R_y)$, and $\mathcal{R}_{AA}(R_x, R_y)$ is achievable by vector quantization and/or random binning, the second statement of the theorem is proved.

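To show the reverse inclusion, fix $\epsilon > 0$ and suppose

$$(\Delta_X, \Delta_Y) \in \mathcal{R}_{AM}(R_x, R_y) \cap \mathcal{R}_{MA}(R_x, R_y) \cap \mathcal{R}_{AA}(R_x, R_y).$$

This implies the existence of $(2^{n_{AM}R_x}, 2^{n_{AM}R_y}, n_{AM})$, $(2^{n_{MA}R_x}, 2^{n_{MA}R_y}, n_{MA})$, and $(2^{n_{AA}R_x}, 2^{n_{AA}R_y}, n_{AA})$ codes satisfying:

$$\begin{aligned}
\Delta_X &\le \frac{1}{n_{AM}} I(X^{n_{AM}}; f_x^{AM}(X^{n_{AM}}), f_y^{AM}(Y^{n_{AM}})) + \epsilon \\
\Delta_Y &\ge \frac{1}{n_{AM}} I(Y^{n_{AM}}; f_x^{AM}(X^{n_{AM}}), f_y^{AM}(Y^{n_{AM}})) - \epsilon \\
\Delta_X &\ge \frac{1}{n_{MA}} I(X^{n_{MA}}; f_x^{MA}(X^{n_{MA}}), f_y^{MA}(Y^{n_{MA}})) - \epsilon \\
\Delta_Y &\le \frac{1}{n_{MA}} I(Y^{n_{MA}}; f_x^{MA}(X^{n_{MA}}), f_y^{MA}(Y^{n_{MA}})) + \epsilon \\
\Delta_X &\le \frac{1}{n_{AA}} I(X^{n_{AA}}; f_x^{AA}(X^{n_{AA}}), f_y^{AA}(Y^{n_{AA}})) + \epsilon \\
\Delta_Y &\le \frac{1}{n_{AA}} I(Y^{n_{AA}}; f_x^{AA}(X^{n_{AA}}), f_y^{AA}(Y^{n_{AA}})) + \epsilon.
\end{aligned}$$

Also, by taking $f_x^{MM}, f_y^{MM}$ to be constants, we trivially have a $(2^{n_{MM}R_x}, 2^{n_{MM}R_y}, n_{MM})$ code satisfying

$$\begin{aligned}
\Delta_X &\ge \frac{1}{n_{MM}} I(X^{n_{MM}}; f_x^{MM}(X^{n_{MM}}), f_y^{MM}(Y^{n_{MM}})) \\
\Delta_Y &\ge \frac{1}{n_{MM}} I(Y^{n_{MM}}; f_x^{MM}(X^{n_{MM}}), f_y^{MM}(Y^{n_{MM}})).
\end{aligned}$$

It is readily verified that, by an appropriate timesharing between these four codes, there exists a $(2^{nR_x}, 2^{nR_y}, n)$ code satisfying

$$\left| \Delta_X - \frac{1}{n} I(X^n; f_x(X^n), f_y(Y^n)) \right| \le \delta(\epsilon), \quad\text{and}\quad \left| \Delta_Y - \frac{1}{n} I(Y^n; f_x(X^n), f_y(Y^n)) \right| \le \delta(\epsilon).$$

This completes the proof of the theorem.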



IV. Concluding Remarks 

In this paper, we considered a setting where two separate encoders have access to correlated sources. We gave a complete characterization of the tradeoff between amplifying information about one source while simultaneously masking another. By combining this result with recent results by Courtade and Weissman [9], we precisely characterized the amount of information that can be revealed about $X^n$ and $Y^n$ by any encoding functions satisfying given rates. There are three notable points here: (i) this multi-letter entropy characterization problem admits a single-letter solution, (ii) restriction of encoding functions to vector quantization and/or random binning is sufficient to achieve any point in the region, and (iii) this simple characterization does not extend to three or more sources/encoders.

Finally, we remark that in the state amplification and masking problems considered in [4] and [5], the authors obtain explicit characterizations of the achievable regions when the channel state and noise are independent Gaussian random variables. Presumably, this could also be accomplished in our setting using known results on Gaussian multiterminal source coding; however, a complete investigation into this matter is beyond the scope of this paper.

Appendix

Lemma 1: With all quantities defined as in the proof of Theorem 1,

$$\limsup_{n\to\infty} \frac{1}{n} H(X^n | L, b(X^n), \mathcal{C}) \le H(X) - \Delta_A + \delta(\epsilon).$$



Proof: We follow the proof strategy of [10, Lemma 22.3] and make adjustments where necessary. For convenience, define $R_x = \Delta_A - I(X;U) + \epsilon$ and recall that $\epsilon$ was chosen sufficiently small so that $R_x < H(X|U)$. Note that we can express the random codebook $\mathcal{C}$ as a pair of random codebooks $\mathcal{C} = (\mathcal{C}_B, \mathcal{C}_{VQ})$, where $\mathcal{C}_B$ is the "binning codebook" at Encoder 1, and $\mathcal{C}_{VQ}$ is the "vector-quantization codebook" at Encoder 2.



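Let $E_1 = 1$ if $(X^n, U^n(L)) \notin T_\epsilon^{(n)}$ and $E_1 = 0$ otherwise. Note that $\Pr(\{E_1 = 1\})$ tends to 0 as $n \to \infty$. Consider

$$\begin{aligned}
H(X^n | L, b(X^n), \mathcal{C}) &\le H(X^n, E_1 | L, b(X^n), \mathcal{C}) \\
&\le 1 + n \Pr(\{E_1 = 1\}) H(X) \\
&\qquad + \sum_{(l, b, c_{VQ})} p(l, b, c_{VQ} | E_1 = 0)\, H(X^n | L = l, b(X^n) = b, E_1 = 0, \mathcal{C}_{VQ} = c_{VQ}, \mathcal{C}_B).
\end{aligned}$$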

Now, let $N(l, b, c_{VQ}, \mathcal{C}_B)$ be the number of sequences $x^n \in B(b) \cap T_\epsilon^{(n)}(X | u^n(l))$, where $B(b)$ denotes the bin of $x^n$-sequences which is labeled by index $b$ and $u^n(l)$ is the codeword in the (fixed) codebook $c_{VQ}$ with index $l$. Note that $N(l, b, c_{VQ}, \mathcal{C}_B)$ is a binomial random variable, where the source of randomness comes from the random codebook $\mathcal{C}_B$. Define

$$E_2(l, b, c_{VQ}, \mathcal{C}_B) = \begin{cases} 1 & \text{if } N(l, b, c_{VQ}, \mathcal{C}_B) \ge 2\,\mathbb{E}[N(l, b, c_{VQ}, \mathcal{C}_B)], \\ 0 & \text{otherwise.} \end{cases}$$

Due to the binomial distribution of $N(l, b, c_{VQ}, \mathcal{C}_B)$, it is readily verified that

$$\begin{aligned}
\mathbb{E}[N(l, b, c_{VQ}, \mathcal{C}_B)] &= 2^{-nR_x} \left| T_\epsilon^{(n)}(X | u^n(l)) \right|, \\
\mathrm{Var}(N(l, b, c_{VQ}, \mathcal{C}_B)) &\le 2^{-nR_x} \left| T_\epsilon^{(n)}(X | u^n(l)) \right|.
\end{aligned}$$

Then, by the Chebyshev lemma [10, Appendix B],

$$\Pr(\{E_2(l, b, c_{VQ}, \mathcal{C}_B) = 1\}) \le \frac{\mathrm{Var}(N(l, b, c_{VQ}, \mathcal{C}_B))}{(\mathbb{E}[N(l, b, c_{VQ}, \mathcal{C}_B)])^2} \le 2^{-n(H(X|U) - R_x - \delta(\epsilon))},$$

which tends to zero as $n \to \infty$ if $R_x < H(X|U) - \delta(\epsilon)$, which is satisfied for $\epsilon$ sufficiently small. Now consider

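$$\begin{aligned}
&H(X^n | L = l, b(X^n) = b, E_1 = 0, \mathcal{C}_{VQ} = c_{VQ}, \mathcal{C}_B) \\
&\quad\le H(X^n, E_2 | L = l, b(X^n) = b, E_1 = 0, \mathcal{C}_{VQ} = c_{VQ}, \mathcal{C}_B) \\
&\quad\le 1 + n \Pr(\{E_2 = 1\}) H(X) \\
&\qquad\quad + H(X^n | L = l, b(X^n) = b, E_1 = 0, E_2 = 0, \mathcal{C}_{VQ} = c_{VQ}, \mathcal{C}_B) \\
&\quad\le 1 + n \Pr(\{E_2 = 1\}) H(X) + n(H(X|U) - R_x + \delta(\epsilon)),
\end{aligned}$$

which implies that

$$\begin{aligned}
H(X^n | L, b(X^n), \mathcal{C}) &\le 2 + n(\Pr(\{E_1 = 1\}) + \Pr(\{E_2 = 1\})) H(X) + n(H(X|U) - R_x + \delta(\epsilon)) \\
&\le 2 + n(\Pr(\{E_1 = 1\}) + \Pr(\{E_2 = 1\})) H(X) + n(H(X) - \Delta_A + \delta(\epsilon)).
\end{aligned}$$

Dividing by $n$ and taking $n \to \infty$ completes the proof.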
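
The Chebyshev step in the proof above can be expanded as follows. This is a sketch added for clarity; the shorthand $M$ and $q$ is introduced here and does not appear in the original argument, and the lower bound on $M$ is the standard size estimate for conditionally typical sets with $n$ sufficiently large.

```latex
% Shorthand (assumed for this sketch): M = |T_\epsilon^{(n)}(X | u^n(l))|, q = 2^{-nR_x},
% so that N \sim \mathrm{Binomial}(M, q).
\begin{align*}
\mathbb{E}[N] &= Mq, \qquad \mathrm{Var}(N) = Mq(1-q) \le Mq, \\
\Pr(N \ge 2\,\mathbb{E}[N])
  &\le \Pr\bigl(|N - \mathbb{E}[N]| \ge \mathbb{E}[N]\bigr)
   \le \frac{\mathrm{Var}(N)}{(\mathbb{E}[N])^2}
   \le \frac{1}{Mq} = \frac{2^{nR_x}}{M} \\
  &\le 2^{-n(H(X|U) - R_x - \delta(\epsilon))},
  \qquad \text{using } M \ge 2^{n(H(X|U) - \delta(\epsilon))}.
\end{align*}
```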



